Current Topics in Artificial Intelligence: Regularization

Author

  • Chenxi Liu
Abstract

This short survey discusses the role of regularization in current deep learning research. Dropout remains the most popular regularization technique for deep neural networks, and this survey is centered on it. We first give a general introduction to and interpretation of dropout (Section 1), then cover follow-up works that either improve speed or offer analysis (Section 2). Methods other than dropout, namely DropConnect and Batch Normalization, are then introduced (Section 3) for comparison and contrast. Finally, the necessity of regularization is argued (Section 4). This is the second of four short surveys.

1 Dropout

Regularization is important for deep neural networks because they usually have millions of parameters to learn, and with limited training data, overfitting is quite likely. Dropout [1] stood out as a simple yet effective regularization technique, and first attracted major attention when it greatly alleviated the overfitting problem of AlexNet [2], the ILSVRC 2012 winner.

1.1 Formulation and Intuition

The idea of dropout is quite simple. It introduces stochasticity into the neural network by (temporarily) dropping random units out, along with all their incoming and outgoing connections. This is equivalent to sampling a “thinned” network from the complete architecture. Concretely:

  • During each training iteration, drop each unit in a layer out with probability p (usually set to 0.5, which means dropping, on average, half of the units in the current layer).
  • During testing, use all the units to make the final prediction, and multiply the outgoing weights of each unit by the probability that it was retained during training, 1 − p (which equals p at the usual setting p = 0.5).

The intuition behind dropout is as follows. If the capacity of the network is much larger than the data it is representing, then overfitting is likely to happen. In terms of weights, this could mean that a large number of weights have learned to “cooperate” or “co-adapt” to fit the training data. These complex co-adaptations could go wrong on new test data. On the other hand, when half of the units are randomly dropped, each unit is forced to work with different “co-workers” during each iteration of training. As a result, it is likely to learn something that is individually useful.

After learning is done, however, we no longer want to use the “thinned” networks and would like to utilize all the knowledge that the network has learned. Since the weights of the next layer are learned while half of the units in the current layer are dropped out, if we fix the outgoing weights of all the units, then
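The training-time and test-time rules of Section 1.1 can be illustrated with a minimal sketch in plain Python. It is not the implementation from [1]; the function names (`dropout_train`, `dropout_test_weights`, `linear`, `mean_thinned_output`) and the toy layer layout are assumptions made for this illustration. Here `p_drop` is the probability of dropping a unit, so the test-time scale factor is the retain probability `1 - p_drop`, which coincides with `p_drop` at the common setting of 0.5.

```python
import random


def dropout_train(activations, p_drop=0.5, rng=random):
    """Training: zero each unit independently with probability p_drop."""
    mask = [0.0 if rng.random() < p_drop else 1.0 for _ in activations]
    thinned = [a * m for a, m in zip(activations, mask)]
    return thinned, mask


def dropout_test_weights(weights, p_drop=0.5):
    """Testing: keep every unit, scale outgoing weights by the retain probability."""
    keep = 1.0 - p_drop
    return [[w * keep for w in row] for row in weights]


def linear(inputs, weights):
    """Next-layer pre-activations; weights[i][j] connects input unit i to output unit j."""
    n_out = len(weights[0])
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        for j in range(n_out)
    ]


def mean_thinned_output(activations, weights, p_drop, n_samples, seed=0):
    """Average the next-layer output over many randomly sampled thinned networks."""
    rng = random.Random(seed)
    totals = [0.0] * len(weights[0])
    for _ in range(n_samples):
        thinned, _ = dropout_train(activations, p_drop, rng)
        for j, value in enumerate(linear(thinned, weights)):
            totals[j] += value
    return [t / n_samples for t in totals]


if __name__ == "__main__":
    h = [1.0, 2.0, 3.0, 4.0]                       # toy activations of the current layer
    W = [[0.5, -1.0], [1.0, 0.5], [-0.5, 2.0], [0.25, 1.0]]  # toy outgoing weights
    print("test-time output:   ", linear(h, dropout_test_weights(W, 0.5)))
    print("mean thinned output:", mean_thinned_output(h, W, 0.5, 20000))
```

For a single linear layer, the test-time output with scaled weights equals the expected output of the sampled thinned networks, so the two printed vectors should be close. This is one way to read the weight-scaling rule: it keeps the input that the next layer sees at test time on the same scale as what it saw, on average, during training.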

Similar resources

Artificial Intelligence Based Approach for Identification of Current Transformer Saturation from Faults in Power Transformers

Protection systems play a vital role in network reliability under short-circuit conditions and in the proper operation of relays. A current transformer in transient saturation under short-circuit conditions often causes maloperation of relays, which has undesirable effects. Therefore, proper and quick identification of current transformer saturation is important. In this paper, an Artificial Neural Network...

Nonlinear System Identification for a DC Motor using NARMAX Model with Regularization Approach

The approach to the design of direct current (DC) motor varies considerably using advanced methods such as artificial intelligence (AI). However, accuracy issues cannot be totally addressed using conventional methods. This paper presents the study on nonlinear autoregressive moving average with exogenous input (NARMAX) model using multilayer perceptron (MLP) neural network for DC motor modeling...

A New Restricted Earth Fault Relay Based on Artificial Intelligence

The restricted earth fault (REF) relay is a type of differential protection which is used for detection of internal ground faults of power transformers. But, during external faults and transformer energization conditions, the probability of current transformer (CT) saturation increases. Thus, the spurious differential current due to CT saturation, can lead to REF relay maloperation. In this pap...

Forecasting of heavy metals concentration in groundwater resources of Asadabad plain using artificial neural network approach

Nowadays 90% of the required water of Iran is secured with groundwater resources and forecasting of pollutants content in these resources is vital. Therefore, this research aimed to develop and employ the feedforward artificial neural network (ANN) to forecast the arsenic (As), lead (Pb), and zinc (Zn) concentration in groundwater resources of Asadabad plain. In this research, the ANN models we...

Artificial Intelligence for Space Applications

The ambitious short-term and long-term goals set down by the various national space agencies call for radical advances in several of the main space engineering areas, the design of intelligent space agents certainly being one of them. In recent years, this has led to an increasing interest in artificial intelligence by the entire aerospace community. However, in the current state of the art, se...



Publication date: 2016